Ref #


Hits 


Search Query 


DBs


Default Operator


Plurals 


Time Stamp 


L1


45 


(document adj attribute) and (search 
or query) and key$1word and @ad <
"20011203" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2006/01/10 16:37 


L2 


9 


707/10.ccls. and L1


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2006/01/10 16:37 


L3 


12 


707/3.ccls. and L1


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2006/01/10 16:37 


L4 


7 


707/4.ccls. and L1


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2006/01/10 16:37 


S1


3 


("6449598" or "6675159").pn. or 
"20010049677" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 22:23 




S2


7


(search same criteria) and (keyword
same match) and (document same 
attribute) and ((scal$4 or adjust$4 or 
normaliz$4) same factor) 


US-PGPUB;

USPAT; 

JPO 


OR


OFF


2004/07/13 12:54 


S3 


11 


(search same criteria) and (keyword 
same match) and (document same 
attribute) and (multiple same links) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 13:29 


S4 


0 


(search same criteria) and (document 
same attribute) and (incoming same 
multiple same links) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 10:16 


S5 


9 


(search same criteria) and (incoming 
same multiple same links) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 10:19 


S6 


32 


(search same criteria) and (number 
same incoming same links)


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/10 07:39 


S7 


0 


("number of links" or "number of 
incoming links") same (search near2 
criteria) and documents 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 10:29 


S8 


0 


(("number of links" or "number of 
incoming links") same (search same 
criteria)) and documents 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 10:28 


S9 


0 


(("number of links" or "number of 
incoming links") and (search same 
criteria)) and documents 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 10:28 


S10


0 


(("number of links" or "number of 
incoming links") and (search same 
criteria)) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 10:28 


S11


0 


("number of links" or "number of 
incoming links") 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 10:29 
S12


4 


((number adj links) or (number near2
incoming near2 links)) same (search
near2 criteria) and documents


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 10:30 


S13 


1 


"20020198875" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 13:23 


S14 


1 


(search same criteria) and (keyword 
same match) and (document same 
attribute) and (readability near index) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 11:10 


S15 


3 


(search same criteria) and (keyword 
same match) and (document same 
attribute) and (readability) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 11:15 


S16 


1 


(readability adj index) same search 
same document 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 11:16 


S17 


25 


(readability adj index) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 11:17 


S18


15 


(search same criteria) and (keyword
same match) and ((scal$4 or adjust$4 
or normaliz$4) same factor) and 
adjust$4 


US-PGPUB;

USPAT; 
JPO 


OR


OFF




S19 


1 


"5742816".pn. 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 14:12 


S20 


1071 


(standard adj deviation) same (square 
adj root) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 14:59 


S21 


0 


((standard adj deviation) same 
(square adj root)) and (readability 
same index) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 14:58 


S22 


1 


((standard adj deviation) same 
(square adj root)) and (search same 
document) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 14:58 


S23 


7 


(search same document) same 
(square adj root) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 15:02 


S24 


39 


(search same document) same 
(square ) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 15:05 


S25 


1 


(search same document) and 
(readability same index) same (search 
same criteria) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 15:05 


S26 


5 


(search same document) and 
(readability ) same (search same 
criteria) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 15:05 
S27


8 


(search same criteria) and (keyword
same match) and ((scal$4 or adjust$4
or normaliz$4) same factor) and
adjust$4 and off$1set


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 15:26 


S28 


23 


(search same criteria) and (keyword
same match) and adjust$4 and
off$1set


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/13 15:27 


S29 


1 


"20030088545" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/19 12:13 


S30 


1 


"20020138487" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/19 12:25 


S31 


1 


"5742816".pn. 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2004/07/19 12:25 


S32 


1 


"20010042587" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 13:23 


S33 


6 


(search adj criteri$2) and (keyword
same match) and (document same
attribute) and (multiple same links)


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 13:46 


S34 


5 


(search adj criteri$2) and (keyword
same match) and (document same
attribute) and (multiple same
links) and @ad < "20010520"


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 16:53 


S35 


0 


(search adj criteri$2) and (keyword
same match) and (document near2
attribute) and (multiple same
links) and @ad < "20010620"


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 13:35 


S36 


32 


(search adj criteri$2) and (keyword 
same match) and (document same 
attribute) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 14:02 


S37 


1 


(establish$4 same (search adj 
criteri$2)) and (keyword same match) 
and (document same attribute) and 
@ad < "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 13:47 


S38 


1 


((creat$4 or establish$4) near3 
(search adj criteri$2)) and (keyword 
same match) and (document same 
attribute) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 13:57 


S39 


12 


((creat$4 or establish$4) same (search 
adj criteri$2)) and (keyword same 
match) and (document same 
attribute) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 13:57 


S40 


32 


(search adj criteri$2) and (keyword 
same match) and (document same 
attribute) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 14:03 
S41


21 


scor$3 same (search adj criteria) same 
database and @ad < "20010621" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 16:54 


S42 


1 


"6738759".pn. 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/09 18:32 


S43 


1657 


(search same criteria) and (number 
same links) 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/10 07:40 


S44 


55 


S43 and 707/4.ccls. 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/10 07:41 


S45 


99 


S43 and 707/1.ccls.


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/10 07:41 


S46 


6 


S44 and 707/1.ccls.


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/10 07:41 


S47 


99 


S45 and 707/1.ccls.


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/10 07:42 


S48 


233 


S43 and 707/3.ccls. 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/10 07:42 


S49 


22 


S43 and 707/9.ccls. 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/02/10 07:42 


S50 


159 


(search with document with 
attribute$1) and @ad < "20010620"


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:20 


S51 


35 


((search adj criter$3) with document 
with attribute$1) and @ad <
"20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:34 


S52 


0 


(((search adj criter$3) with document 
with attribute$1) same rank$4) and
@ad < "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:21 


S53 


0 


(((search adj criter$3) with document 
with attribute$1) and rank$4) and
@ad < "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:21 


S54 


0 


((search adj criter$3) with document 
with attribute$1) and rank$4 and @ad
< "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:21 


S55 


0 


((search adj criter$3 with exclu$4) 
same (document with attribute$1))
and @ad < "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:35 


S56 


0 


((search with criter$3 with exclu$4) 
same (document with attribute$1))
and @ad < "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:35 
S57


0 


((search same criter$3 same exclu$4) 
same (document with attribute$1))
and @ad < "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:36 


S58 


5 


((search same criter$3 same exclu$4) 
and (document with attribute$1)) and
@ad < "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:37 


S59 


2 


((search same criter$3 same 
exclusive) and (document with 
attribute$1)) and @ad < "20010620"


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:40 


S60 


5 


((search same criter$3 same exclu$5) 
and (document with attribute$1)) and
@ad < "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:50 


S61 


4 


((search with criter$3) same exclu$5) 
and (document with attribute$1) and
@ad < "20010520" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:52 


S62 


1 


(search with exclu$5) same 
(document with attribute$1) and @ad
< "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:54 


S63 


1129 


(search with exclu$5) and @ad < 
"20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:54 


S64 


46 


(search with exclu$5) same (search 
with criteria) and @ad < "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:54 


S65 


19 


(search with exclu$5) same (search 
with criteria) and document and @ad 
< "20010620" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 21:54 


S66 


4 


("6449598" or "6675159" or 
"6738759").pn. or "20010049677" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 22:24 


S67 


4 


( "6675159" or "6738759").pn. or 
"20010049677" or "20020198875" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/07/12 22:25 


S68 


0 


(dcoument adj attribute) and @ad < 
"20011203" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2005/01/10 15:15 


S69 


301 


(document adj attribute) and @ad < 
"20011203" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2006/01/10 15:16 


S70 


55 


(document adj attribute) and 
key$1word and @ad < "20011203"


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2006/01/10 15:17 


S71 


45 


(document adj attribute) and (search 
or query) and key$1word and @ad <
"20011203" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2006/01/10 16:37 


S72 


4 


(document adj attribute) and scor$4 
and criter$4 and (search or query) and 
key$1word and @ad < "20011203"


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2006/01/10 15:30 
S73


11 


(document adj (attribute or 
metadata)) and scor$4 and criter$4 
and (search or query) and key$1word
and @ad < "20011203" 


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2006/01/10 15:33 


S74 


11 


((document or (web with page)) adj 
(attribute or metadata)) and scor$4 
and criter$4 and (search or query) and 
key$1word and @ad < "20011203"


US-PGPUB; 

USPAT; 

JPO 


OR 


OFF 


2006/01/10 15:33 
Google Advanced Search






Find results

with all of the words: search criterion keyword match
with the exact phrase: document attribute
results per page: 100






http://www.google.com/advanced_search?q=search+criterion+keyword+^ 1/10/06



search criterion keyword match score scaling "document attribute" - Google Search 
Web Results 1 - 77 of about 89 for search criterion keyword match score scaling "document attribute".

[PS] Efficient Encodings for Document Ranking Vectors Taher H ...
File Format: Adobe PostScript - View as Text 

For efficient keyword-search query processing over large document repositories, 
... For a large-scale search engine supporting millions of users, ... 
www.stanford.edu/~taherh/papers/encoding-pagerank.ps - Similar pages



[PDF] Efficient Encodings for Document Ranking Vectors 
File Format: PDF/Adobe Acrobat - View as HTML 

cient keyword-search query processing over large document ... the context of 

large-scale Web search. An excellent overview ... 

www.stanford.edu/~taherh/papers/encoding-pagerank.pdf - Similar pages

[PDF] SearchVis

File Format: PDF/Adobe Acrobat - View as HTML 

create a new query which narrows down the search result set keywords are made 
... between the contents of an information space and N search criteria. ... 

www.iicm.edu/thesis/smayr.pdf - Similar pages



[DOC] Document No. 

File Format: Microsoft Word 6 - View as HTML 

Detection criteria may be stated as Boolean keyword criteria and ... Priority scores 
can be weighted, based on where in the document the match occurs, ... 
www.itl.nist.gov/iaui/894.02/related_projects/tipster/docs/req201-doc - Similar pages

[PDF] CREATING A SYNTACTIC DOCUMENT ONTOLOGY 
File Format: PDF/Adobe Acrobat - View as HTML 

Each document attribute reveals a unique aspect of the documents and helps with 
infor-. mation display, identification, search and classification. ... 

etda.libraries.psu.edu/theses/approved/WorldWideFiles/ETD-652/mythesis.pdf - Similar pages

[PDF] Aviator Aviator Aviator Aviator Administrator Guide Administrator ...
File Format: PDF/Adobe Acrobat - View as HTML 

Search - advanced keyword and attribute searches simplify access to files that 
would not be ... Level /. Criteria. Score (MB per month). Processor ... 

www.aviatorsoftware.com/tech/docs/adminguide.pdf - Similar pages

[PS] Geometric Layout Analysis Techniques for Document Image ...
File Format: Adobe PostScript - View as Text

Another interesting example is DAFS (Document Attribute Format Specification) 
... The criterion to optimize and prune the search is based on the cumulative ... 
tev.itc.it/people/modena/Papers/DOC_SEGstate.ps.gz - Similar pages

[PDF] Geometric Layout Analysis Techniques for Document Image ...
File Format: PDF/Adobe Acrobat - View as HTML 

Another interesting example is DAFS (Document Attribute For- ... The criterion 
to optimize and prune the search is based on the cumulative ... 
tev.itc.it/people/modena/Papers/DOC_SEGstate.pdf - Similar pages 



http://www.google.com/search?num=100&hl=en&lr=&as_qdr=all&q=search+criteri^ 1/10/06
[PDF] Human Computer Interaction Development & Management
File Format: PDF/Adobe Acrobat - View as HTML

search for a certain item: browsing, directed search, and exact match using
keywords. During experimentation, the performance of each subject is measured ...
biblio.unap.cl/PDF/IRM%20Press%20-%202002%20-%20Human%20Computer%20Interaction%20Development%20and%20Mana... - Similar pages

[PDF] biblio.unap.cl/biblio/PDF/IRM%20Press%20-%202002%2...
File Format: PDF/Adobe Acrobat - View as HTML
Supplemental Result - Similar pages

[PDF] Interaction Harvesting for Document Retrieval by Noah S. Fields 
File Format: PDF/Adobe Acrobat - View as HTML 

keyword strategies require an exact keyword match, keyword expansion is a. 
technique where one keyword is associated with many search terms, and has ... 
noah.cx/thesis24.pdf - Supplemental Result - Similar pages 

Haveliwala, Taher H.: Efficient Encodings for Document Ranking Vectors
for Efficient keyword-search query processing over large Document repositories, 
... for a large-scale search engine supporting millions of users, ... 

dbpubs.stanford.edu:8090/pub/2002-58 - 60k - Supplemental Result - Cached - Similar pages 

[PDF] A user-centered interface for information exploration in a ...
File Format: PDF/Adobe Acrobat 

issuing the keyword queries "cryptography" and "neural, networks" to two Web 
search services and two computer-, science citation search services, ... 

dx.doi.org/10.1002/(SICI)1097-4571(2000)51:3%3C297::AID-ASI8%3E3.0.CO;2-N - Similar pages

[PDF] www.derwent.com/derwenthome/media/productpdfs/itpp...
File Format: PDF/Adobe Acrobat - View as HTML
Supplemental Result - Similar pages

[PDF] Document No. 

File Format: PDF/Adobe Acrobat - View as HTML 

Criteria used for retrospective search and for Routing shall have the same ... 
Detection Criteria may be stated as Boolean Keyword Criteria and negative ... 
ayre.ca/library/tipster/req201.pdf - Supplemental Result - Similar pages

[PDF] Deliverable 

File Format: PDF/Adobe Acrobat - View as HTML 

Fuzzy match. • Keyword Search. • select the maximum Number of records to return 
... Compass Server 3.0 is designed to scale to large intranets, with support ... 

www.en.eun.org/eun.org2/eun/html/mm1010/public/d06_4_3.pdf - Supplemental Result - Similar pages

[PDF] MICROSOFT WORD 97 FOR WINDOWS SAMPLE DISSERATATION: 
File Format: PDF/Adobe Acrobat - View as HTML 

Recently, TEXT-search extension Modules have become available FOR object- ... 
THE keyword Boolean, forces any match with nonzero score value TO succeed. ... 
purl.fcla.edu/fcla/etd/UFE0000346 - Supplemental Result - Similar pages

[PDF] Geometric Layout Analysis Techniques for Document Image ...
File Format: PDF/Adobe Acrobat - View as HTML 

is classified, portrait or landscape, by a class majority criterion Among the. 

nine underlying squares taking into account the classification scores, skew ... 

www.ee.bgu.ac.il/.../David_Cahana_Geometric%20Layout%20Analysis%20Techniques%20-%20a%20Review.pdf - Supplemental Result - Similar pages
EDGAR Online Pro | Text Section 

Our SEARCH engine is designed to enable users to formulate and refine queries 
using a series of information retrieval methods including keyword, thesauri, ... 

pro.edgar-online.com/ipo/textSection.asp?cikid=6852&fnid=24826&IPO=0&sec=bd&coname=VERITY... - 52k -
Supplemental Result - Cached - Similar pages 

[PDF] Efficient Encodings for Document Ranking Vectors 
File Format: PDF/Adobe Acrobat - View as HTML 

cient keyword-search query processing over large Document ... The context of 
large-scale Web search, an excellent overview ... 

www-db.stanford.edu/~taherh/papers/encoding-pagerank.pdf - Supplemental Result - Similar pages

[PDF] www.music.mcgill.ca/~ich/classes/mumt611_05/Presen...
File Format: PDF/Adobe Acrobat - View as HTML 
Supplemental Result - Similar pages 

[PDF] A user-centered interface for information exploration in a ...
File Format: PDF/Adobe Acrobat 

a relevance score. In the realm of information sources, the Stanford InfoBus. 
Keywords: GIS data integration, approximate methods, heterogeneous data sources



Multidocument summarization: An added value to clustering in interactive retrieval 
Manuel J. Maña-López, Manuel De Buenaga, José M. Gómez-Hidalgo
April 2004 ACM Transactions on Information Systems (TOIS), volume 22 issue 2 
Publisher: ACM Press 

Full text available: Additional Information: full citation, abstract, references, index terms
A more and more generalized problem in effective information access is the presence in
the same corpus of multiple documents that contain similar information. Generally, users 
may be interested in locating, for a topic addressed by a group of similar documents, one 
or several particular aspects. This kind of task, called instance or aspectual retrieval, has 
been explored in several TREC Interactive Tracks. In this article, we propose in addition to 
the classification capacity of clustering techn ... 

Keywords: Multidocument summarization, topic segmentation 



Paper session 4: XML query processing: Best-match querying from document-centric 
XML 

Jaap Kamps, Maarten Marx, Maarten de Rijke, Borkur Sigurbjornsson 
June 2004 Proceedings of the 7th International Workshop on the Web and 
Databases: colocated with ACM SIGMOD/PODS 2004 WebDB '04 

Publisher: ACM Press 

Full text available: pdf(277.47 KB) Additional Information: full citation, abstract, references, citings

On the Web, there is a pervasive use of XML to give lightweight semantics to textual 
collections. Such document-centric XML collections require a query language that can 
gracefully handle structural constraints as well as constraints on the free text of the 
documents. Our main contributions are three-fold. First, we outline two fragments of 
XPath tailored to users that have varying degrees of understanding of the XML structure 
used, and give both syntactic and semantic characterizations of these ... 

Keywords: XML retrieval, XPath, full-text XML querying 



Learning to match ontologies on the Semantic Web 

AnHai Doan, Jayant Madhavan, Robin Dhamankar, Pedro Domingos, Alon Halevy 
November 2003 The VLDB Journal — The International Journal on Very Large Data 

Bases, volume 12 issue 4 
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(263.63 KB) Additional Information: full citation, abstract, index terms

On the Semantic Web, data will inevitably come from many different ontologies, and 
information processing across ontologies is not possible without knowing the semantic 
mappings between them. Manually finding such mappings is tedious, error-prone, and 
clearly not possible on the Web scale. Hence the development of tools to assist in the 
ontology mapping process is crucial to the success of the Semantic Web. We describe 
GLUE, a system that employs machine learning techniques to find such m ... 

Keywords: Machine learning, Ontology matching, Relaxation labeling, Semantic Web 



InfoCrystal: a visual tool for information retrieval & management
Anselm Spoerri 

December 1993 Proceedings of the second international conference on Information 
and knowledge management 

Publisher: ACM Press 

Full text available: pdf(999.84 KB) Additional Information: full citation, references, citings, index terms
Ji-Rong Wen, Qing Li, Wei-Ying Ma, Hong-Jiang Zhang
March 2003 ACM SIGMOD Record, volume 32 issue 1

Publisher: ACM Press 

Full text available: pdf(524.08 KB) Additional Information: full citation, abstract, references, citings

To truly meet the requirements of multimedia database (MMDB) management, an 
integrated framework for modeling, managing and retrieving various kinds of media data 
in a uniform way is necessary. MediaLand is an experimental MMDB platform being 
developed at Microsoft Research Asia for users with different levels of experiences and 
expertise to manage and search multimedia repositories easily, efficiently, and 
cooperatively. Key features of MediaLand include a uniform data model for describi ... 

Keywords: media independence, multi-paradigm querying, multimedia database 
management, uniform data modeling 



Real-time shading

Marc Olano, Kurt Akeley, John C. Hart, Wolfgang Heidrich, Michael McCool, Jason L. Mitchell,
Randi Rost

August 2004 Proceedings of the conference on SIGGRAPH 2004 course notes GRAPH '04

Publisher: ACM Press 

Full text available: pdf(7.39 MB) Additional Information: full citation, abstract

Real-time procedural shading was once seen as a distant dream. When the first version of 
this course was offered four years ago, real-time shading was possible, but only with one- 
of-a-kind hardware or by combining the effects of tens to hundreds of rendering passes. 
Today, almost every new computer comes with graphics hardware capable of interactively 
executing shaders of thousands to tens of thousands of instructions. This course has been 
redesigned to address today's real-time shading capabili ...

12 Approximate query mapping: Accounting for translation closeness 
Kevin Chen-Chuan Chang, Hector García-Molina

September 2001 The VLDB Journal — The International Journal on Very Large Data Bases, volume 10 issue 2-3

Publisher: Springer-Verlag New York, Inc.

Full text available: pdf(661.19 KB) Additional Information: full citation, abstract, index terms

In this paper we present a mechanism for approximately translating Boolean query 
constraints across heterogeneous information sources. Achieving the best translation is 
challenging because sources support different constraints for formulating queries, and 
often these constraints cannot be precisely translated. For instance, a query [score>8] 
might be "perfectly" translated as [rating>0.8] at some site, but can only be 
approximated as [grade=A] at another. Unlike other work, our ... 

Keywords: Approximate query translation, Closeness, Constraint-mapping, Information 
integration, Mediators 



Data clustering: a review 
A. K. Jain, M. N. Murty, P. J. Flynn 

September 1999 ACM Computing Surveys (CSUR), volume 31 issue 3
Publisher: ACM Press

Full text available: pdf(636.24 KB) Additional Information: full citation, abstract, references, citings, index terms, review
Clustering is the unsupervised classification of patterns (observations, data items, or
feature vectors) into groups (clusters). The clustering problem has been addressed in
many contexts and by researchers in many disciplines; this reflects its broad appeal and
usefulness as one of the steps in exploratory data analysis. However, clustering is a
difficult problem combinatorially, and differences in assumptions and contexts in different
communities has made the transfer of useful generic co ...

Keywords: cluster analysis, clustering applications, exploratory data analysis, 
incremental clustering, similarity indices, unsupervised learning 



14 Extracting relational data from HTML repositories

Ruth Yuee Zhang, Laks V. S. Lakshmanan, Ruben H. Zamar
December 2004 ACM SIGKDD Explorations Newsletter, volume 6 issue 2
Publisher: ACM Press

Full text available: pdf(271.48 KB) Additional Information: full citation, abstract, references

There is a vast amount of valuable information in HTML documents, widely distributed
across the World Wide Web and across corporate intranets. Unfortunately, HTML is mainly
presentation oriented and hard to query. In this paper, we develop a system to extract
desired information (records) from thousands of HTML documents, starting from a small
set of examples. Duplicates in the result are automatically detected and eliminated. We
propose a novel method to estimate the current coverage of results ...

Keywords: coverage estimation, duplication, information extraction, pattern




15 Selected IR-Related Dissertation Abstracts 

Susanne M. Humphrey 
November 1990 ACM SIGIR Forum, volume 24 issue 3

Publisher: ACM Press

Full text available: pdf(2.27 MB) Additional Information: full citation, abstract

The following are citations selected by title and abstract as being related to Information 
Retrieval (IR), resulting from a computer search, using Dialog Information Services, of 
the Dissertation Abstracts Online database produced by University Microfilms International 
(UMI). Included are UMI order number, title, author, degree, year, institution; number of 
pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen 
by the author, and abstract. Unless otherwise sped ... 

16 Special issue on knowledge representation
Ronald J. Brachman, Brian C. Smith 
February 1980 ACM SIGART Bulletin, issue 70 

Publisher: ACM Press 

Full text available: pdf(13.13 MB) Additional Information: full citation, abstract

In the fall of 1978 we decided to produce a special issue of the SIGART Newsletter
devoted to a survey of current knowledge representation research. We felt that there
were two useful functions such an issue could serve. First, we hoped to elicit a clear
picture of how people working in this subdiscipline understand knowledge representation
research, to illuminate the issues on which current research is focused, and to catalogue
what approaches and techniques are currently being developed. Secon ... 
November 2005 Proceedings of the 13th annual ACM international conference on
Multimedia MULTIMEDIA '05

Publisher: ACM Press

Full text available: pdf(633.40 KB) Additional Information: full citation, abstract, references, index terms

In this paper, we propose a novel automatic approach for personalized music sports video
generation. Two research challenges, semantic sports video content selection and 
automatic video composition, are addressed. For the first challenge, we propose to use 
multi-modal (audio, video and text) feature analysis and alignment to detect the semantic 
of events in sports video. For the second challenge, we propose video-centric and music- 
centric music video composition schemes to automatically generate ... 

Keywords: automatic video editing, event detection, personalized music sports video, 
sports video analysis, video content selection 



18 Finding expertise and information: Searching for expertise in social networks: a
simulation of potential strategies
Jun Zhang, Mark S. Ackerman

November 2005 Proceedings of the 2005 international ACM SIGGROUP conference on
Supporting group work GROUP '05

Publisher: ACM Press

Full text available: pdf(1.85 MB) Additional Information: full citation, abstract, references, index terms

People search for people with suitable expertise all of the time in their social networks - to 
answer questions or provide help. Recently, efforts have been made to augment this 
searching. However, relatively little is known about the social characteristics of various 
algorithms that might be useful. In this paper, we examine three families of searching 
strategies that we believe may be useful in expertise location. We do so through a 
simulation, based on the Enron email data set. (We would be u ... 

Keywords: CSCW, computer-supported cooperative work, expertise finding, expertise 
location, expertise sharing, information seeking, organizational simulations, social 
computing, social networks 




19 Information storage and retrieval: a survey and functional description 
Jack Minker 

September 1977 ACM SIGIR Forum, volume 12 issue 2 
Publisher: ACM Press 

Full text available: pdf(5.14 MB) Additional Information: full citation, abstract, references 

Information Storage and Retrieval (IS&R) encompasses a broad scope of topics ranging 
from basic techniques for accessing data to sophisticated approaches for the analysis of 
natural language text and the deduction of information. Within the field, three general 
areas of investigation can be distinguished not only by their subject matter but also by 
the types of individuals presently interested in them: (1) Document retrieval, (2) 
Generalized data management, and (3) Question-answering. A functional ... 

Keywords: automatic indexing, data management, data structures, deductive search, 
information retrieval, natural language, problem solving, question-answering, relational 
data systems, theorem proving 



20 Spatial querying for image retrieval: a user-oriented evaluation 

Joemon M. Jose, Jonathan Furner, David J. Harper 
August 1998 Proceedings of the 21st annual international ACM SIGIR conference on 
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21 Link-based ranking 2: Searching the workplace web 

Ronald Fagin, Ravi Kumar, Kevin S. McCurley, Jasmine Novak, D. Sivakumar, John A. Tomlin, 
David P. Williamson 

May 2003 Proceedings of the 12th international conference on World Wide Web 
Publisher: ACM Press 

Full text available: pdf(231.55 KB) Additional Information: full citation, abstract, references, citings, index terms 



The social impact from the World Wide Web cannot be underestimated, but technologies 
used to build the Web are also revolutionizing the sharing of business and government 
information within intranets. In many ways the lessons learned from the Internet carry 
over directly to intranets, but others do not apply. In particular, the social forces that 
guide the development of intranets are quite different, and the determination of a "good 
answer" for intranet search is quite different than on the Int ... 

22 Database session 7: bioinformatics: Information extraction from biomedical literature: 
methodology, evaluation and an application 

L. Venkata Subramaniam, Sougata Mukherjea, Pankaj Kankar, Biplav Srivastava, Vishal S. 
Batra, Pasumarti V. Kamesam, Ravi Kothari 

November 2003 Proceedings of the twelfth international conference on Information 
and knowledge management 

Publisher: ACM Press 

Full text available: pdf(261.28 KB) Additional Information: full citation, abstract, references, citings, index terms 



Journals and conference proceedings represent the dominant mechanisms of reporting 
new biomedical results. The unstructured nature of such publications makes it difficult to 
utilize data mining or automated knowledge discovery techniques. Annotation (or markup) 
of these unstructured documents represents the first step in making these documents 
machine analyzable. In this paper we first present a system called BioAnnotator for 
identifying and annotating biological terms in documents. BioAnnotator ... 

Keywords: biological document processing, information extraction 
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December 1986 Communications of the ACM, volume 29 issue 12 
Publisher: ACM Press 

Full text available: pdf(1.04 MB) Additional Information: full citation, abstract, references, citings, index terms, review 

A new implementation of free-text search using a new parallel computer, the Connection 
Machine®, makes possible the application of exhaustive methods not previously feasible 
for large databases. 

Conference abstracts 

January 1977 Proceedings of the 5th annual ACM computer science conference 
Publisher: ACM Press 
Full text available: pdf(3.14 MB) Additional Information: full citation, abstract, index terms 

One problem in computer program testing arises when errors are found and corrected 
after a portion of the tests have run properly. How can it be shown that a fix to one area 
of the code does not adversely affect the execution of another area? What is needed is a 
quantitative method for assuring that new program modifications do not introduce new 
errors into the code. This model considers the retest philosophy that every program 
instruction that could possibly be reached and tested from the ... 

25 Notes Explorer: entity-based retrieval in shared, semi-structured information spaces 
Scott Huffman, Catherine Baudin 

November 1996 Proceedings of the fifth international conference on Information and 
knowledge management 

Publisher: ACM Press 

Full text available: pdf(951.75 KB) Additional Information: full citation, references, citings, index terms 



26 Web document clustering: a feasibility demonstration 
Oren Zamir, Oren Etzioni 

August 1998 Proceedings of the 21st annual international ACM SIGIR conference on 
Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf(1.43 MB) Additional Information: full citation, references, citings, index terms 



A semantic-based approach to component retrieval 
Vijayan Sugumaran, Veda C. Storey 
August 2003 ACM SIGMIS Database, volume 34 issue 3 
Publisher: ACM Press 

Full text available: pdf(367.67 KB) Additional Information: full citation, abstract, references, citings, index terms 

There continues to be a great deal of pressure to design and develop information systems 
within a short period of time. This urgency has reinvigorated research on software reuse, 
particularly in component based software development. One of the major problems 
associated with component-based development is the difficulty in searching and retrieving 
reusable components that meet the requirement at hand. In part, this problem exists 
because of the lack of sophisticated query methods and techniques. ... 

Keywords: component based development, domain model, ontology, reuse repository, 
systems development 
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28 Machine learnin g in automated text categorization 
Fabrizio Sebastiani 

March 2002 ACM Computing Surveys (CSUR), volume 34 issue 1 
Publisher: ACM Press 

Full text available: pdf(524.41 KB) Additional Information: full citation, abstract, references, citings, index terms 

The automated categorization (or classification) of texts into predefined categories has 
witnessed a booming interest in the last 10 years, due to the increased availability of 
documents in digital form and the ensuing need to organize them. In the research 
community the dominant approach to this problem is based on machine learning 
techniques: a general inductive process automatically builds a classifier by learning, from 
a set of preclassified documents, the characteristics of the categories. ... 

Keywords: Machine learning, text categorization, text classification 



29 Information retrieval session 8: efficiency: Online duplicate document detection: 
signature reliability in a dynamic retrieval environment 

Jack G. Conrad, Xi S. Guo, Cindy P. Schriber 

November 2003 Proceedings of the twelfth international conference on Information 
and knowledge management 
Publisher: ACM Press 

Full text available: pdf(215.37 KB) Additional Information: full citation, abstract, references, citings, index terms 

As online document collections continue to expand, both on the Web and in proprietary 
environments, the need for duplicate detection becomes more critical. Few users wish to 
retrieve search results consisting of sets of duplicate documents, whether identical 
duplicates or close matches. Our goal in this work is to investigate the phenomenon and 
determine one or more approaches that minimize its impact on search results. Recent 
work has focused on using some form of signature to characterize a do ... 

Keywords: data management, doc signatures, duplicate document detection 

30 Oral session 2: web searching and applications: Similarity space projection for web 
image search and annotation 

Ying Liu, Tao Qin, Tie-Yan Liu, Lei Zhang, Wei-Ying Ma 

November 2005 Proceedings of the 7th ACM SIGMM international workshop on 
Multimedia information retrieval MIR '05 
Publisher: ACM Press 

Full text available: pdf(589.39 KB) Additional Information: full citation, abstract, references, index terms 

Web image search has been explored and developed in academic as well as commercial 
areas for over a decade. To measure the similarity between Web images and user queries, 
most of the existing Web image search systems try to convert an image to textual 
keywords by analyzing the textual information available (such as surrounding text and 
image filename) with or without leveraging image visual features (such as color, texture, 
shape). In this way, the existing systems transform "Web images" to the ... 

Keywords: image annotation, similarity space projection, web image search 

31 Special issue on ICML: Coupled clustering: a method for detecting structural 
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Zvika Marx, Ido Dagan, Joachim 1^. Buhmann, Eli Shamir 
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March 2003 The Journal of Machine Learning Research, volume 3 
Publisher: MIT Press 

Full text available: pdf(967.15 KB) Additional Information: full citation, abstract, citings, index terms 

This paper proposes a new paradigm and a computational framework for revealing 
equivalencies (analogies) between sub-structures of distinct composite systems that are 
initially represented by unstructured data sets. For this purpose, we introduce and 
investigate a variant of traditional data clustering, termed coupled clustering, which 
outputs a configuration of corresponding subsets of two such representative sets. We 
apply our method to synthetic as well as textual data. Its achievement ... 

32 Supporting efficient multimedia database exploration 
Wen-Syan Li, K. Selçuk Candan, Kyoji Hirata, Yoshinori Hara 

April 2001 The VLDB Journal — The International Journal on Very Large Data Bases, 

volume 9 issue 4 
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(569.30 KB) Additional Information: full citation, abstract, index terms 

Due to the fuzziness of query specification and media matching, multimedia retrieval is 
conducted by way of exploration. It is essential to provide feedback so that users can 
visualize query reformulation alternatives and database content distribution. Since media 
matching is an expensive task, another issue is how to efficiently support exploration so 
that the system is not overloaded by perpetual query reformulation. In this paper, we 
present a uniform framework to represent statistical inform ... 

Keywords: Exploration, Human computer interaction, Multimedia database, Progressive 
processing, Query relaxation, Selectivity statistics 



33 Effect of different network analysis strategies on search engine re-ranking 
Behnak Yaltaghian, Mark H. Chignell 

October 2004 Proceedings of the 2004 conference of the Centre for Advanced Studies 
on Collaborative research 

Publisher: IBM Press 

Full text available: pdf(253.28 KB) Additional Information: full citation, abstract, references, index terms 

The research described in this paper examined two different approaches to building the 
co-citation network that the authors have used in re-ranking the set of results returned by 
a search engine [22, 23]. The more computationally demanding (in terms of query load) 
inter- or Web-wide co-citation approach used in-links from throughout the Web to build 
the network. In contrast, the intra co-citation approach only used inlinks inferred from 
search engine output. Results of this study confirmed th ... 

34 Constructing multi-granular and topic-focused web site maps 
Wen-Syan Li, Necip Fazil Ayan, Okan Kolak, Quoc Vu, Hajime Takano, Hisashi Shimamura 
April 2001 Proceedings of the 10th international conference on World Wide Web 
Publisher: ACM Press 

Full text available: pdf(3.18 MB) Additional Information: full citation, references, citings, index terms 



Keywords: decision tree algorithm, logical domain, multi-granularity, site map, topic 
distillation 



35 Fast multiresolution image querying 
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Keywords: content-based retrieval, image databases, image indexing, image metrics, 
query by content, query by example, similarity retrieval, sketch retrieval, wavelets 



36 Applying summarization techniques for term selection in relevance feedback 
Adenike M. Lam-Adesina, Gareth J. F. Jones 

September 2001 Proceedings of the 24th annual international ACM SIGIR conference 
on Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf(253.28 KB) Additional Information: full citation, abstract, references, citings, index terms 
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Query-expansion is an effective Relevance Feedback technique for improving performance 
in Information Retrieval. In general query-expansion methods select terms from the 
complete contents of relevant documents. One problem with this approach is that 
expansion terms unrelated to document relevance can be introduced into the modified 
query due to their presence in the relevant documents and distribution in the document 
collection. Motivated by the hypothesis that query-expansion terms should ... 



37 Research track papers: Probabilistic author-topic models for information discovery 




Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, Thomas Griffiths 
August 2004 Proceedings of the tenth ACM SIGKDD international conference on 
Knowledge discovery and data mining KDD '04 

Publisher: ACM Press 

Full text available: pdf(323.72 KB) Additional Information: full citation, abstract, references, index terms 

We propose a new unsupervised learning technique for extracting information from large 
text collections. We model documents as if they were generated by a two-stage stochastic 
process. Each author is represented by a probability distribution over topics, and each 
topic is represented as a probability distribution over words for that topic. The words in a 
multi-author paper are assumed to be the result of a mixture of each authors' topic 
mixture. The topic-word and author-topic distributions are ... 

Keywords: Gibbs sampling, text modeling, unsupervised learning 



38 Evaluating topic-driven web crawlers 

Filippo Menczer, Gautam Pant, Padmini Srinivasan, Miguel E. Ruiz 

September 2001 Proceedings of the 24th annual international ACM SIGIR conference 
on Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf(210.09 KB) Additional Information: full citation, abstract, references, citings, index terms 

Due to limited bandwidth, storage, and computational resources, and to the dynamic 
nature of the Web, search engines cannot index every Web page, and even the covered 
portion of the Web cannot be monitored continuously for changes. Therefore it is essential 
to develop effective crawling strategies to prioritize the pages to be indexed. The issue is 
even more important for topic-specific search engines, where crawlers must make 
additional decisions based on the relevance of visited pages. ... 
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Keywords: InfoSpiders, PageRank, Web information retrieval, best-first search, focused 
crawlers, performance metrics, topic driven crawling 



39 Maintenance and evolution: UMLDiff: an algorithm for object-oriented design 
differencing 

Zhenchang Xing, Eleni Stroulia 

November 2005 Proceedings of the 20th IEEE/ACM international Conference on 
Automated software engineering ASE '05 

Publisher: ACM Press 

Full text available: pdf(287.60 KB) Additional Information: full citation, abstract, references, index terms 

This paper presents UMLDiff, an algorithm for automatically detecting structural changes 
between the designs of subsequent versions of object-oriented software. It takes as input 
two class models of a Java software system, reverse engineered from two corresponding 
code versions. It produces as output a change tree, i.e., a tree of structural changes, that 
reports the differences between the two design versions in terms of (a) additions, 
removals, moves, renamings of packages, classes, interfaces ... 

Keywords: design differencing, design mentoring, design understanding, structural 
evolution 



40 Question answering: Question answering passage retrieval using dependency 
relations 

Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, Tat-Seng Chua 

August 2005 Proceedings of the 28th annual international ACM SIGIR conference on 
Research and development in information retrieval SIGIR '05 

Publisher: ACM Press 

Full text available: pdf(304.92 KB) Additional Information: full citation, abstract, references, index terms 

State-of-the-art question answering (QA) systems employ term-density ranking to 
retrieve answer passages. Such methods often retrieve incorrect passages as 
relationships among question terms are not considered. Previous studies attempted to 
address this problem by matching dependency relations between questions and answers. 
They used strict matching, which fails when semantically equivalent relationships are 
phrased differently. We propose fuzzy relation matching based on statistical models. 
We ... 

Keywords: dependency parsing, passage retrieval, question answering 
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41 Global digital museum: nnultimedia information access and creation on the Internet 
Junichi Takahashi, Takayuki Kushida, Jung-Kook Hong, Shigeharu Sugita, Yasuyuki Kurita, 
Robert Rieger, Wendy Martin, Geri Gay, John Reeve, Rowena Loverance 
May 1998 Proceedings of the third ACM conference on Digital libraries 
Publisher: ACM Press 

Full text available: pdf(1.41 MB) Additional Information: full citation, references, citings, index terms 



42 A market-based approach to recommender systems 
Yan Zheng Wei, Luc Moreau, Nicholas R. Jennings 

July 2005 ACM Transactions on Information Systems (TOIS), volume 23 issue 3 
Publisher: ACM Press 

Full text available: pdf(2.06 MB) Additional Information: full citation, abstract, references, index terms 

Recommender systems have been widely advocated as a way of coping with the problem 
of information overload for knowledge workers. Given this, multiple recommendation 
methods have been developed. However, it has been shown that no one technique is best 
for all users in all situations. Thus we believe that effective recommender systems should 
incorporate a wide variety of such techniques and that some form of overarching 
framework should be put in place to coordinate the various recommendations so ... 

Keywords: Recommender systems, auctions, marketplace 



43 Research sessions: query processing II: Efficient k-NN search on vertically 
decomposed data 

Arjen P. de Vries, Nikos Mamoulis, Niels Nes, Martin Kersten 

June 2002 Proceedings of the 2002 ACM SIGMOD international conference on 
Management of data 

Publisher: ACM Press 

Full text available: pdf(1.26 MB) Additional Information: full citation, abstract, references, index terms 

Applications like multimedia retrieval require efficient support for similarity search on 
large data collections. Yet, nearest neighbor search is a difficult problem in high 
dimensional spaces, rendering efficient applications hard to realize: index structures 
degrade rapidly with increasing dimensionality, while sequential search is not an attractive 
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solution for repositories with millions of objects. This paper approaches the problem from 
a different angle. A solution is sought in an unconvent ... 

44 Web search 2: A study of relevance propagation for web search 

Tao Qin, Tie-Yan Liu, Xu-Dong Zhang, Zheng Chen, Wei-Ying Ma 
August 2005 Proceedings of the 28th annual international ACM SIGIR conference on 
Research and development in information retrieval SIGIR '05 

Publisher: ACM Press 

Full text available: pdf(391.03 KB) Additional Information: full citation, abstract, references, index terms 

Different from traditional information retrieval, both content and structure are critical to 
the success of Web information retrieval. In recent years, many relevance propagation 
techniques have been proposed to propagate content information between web pages 
through web structure to improve the performance of web search. In this paper, we first 
propose a generic relevance propagation framework, and then provide a comparison 
study on the effectiveness and efficiency of various representative pro ... 

Keywords: hyperlink based score propagation, hyperlink based term propagation, 
relevance propagation, sitemap based score propagation, sitemap based term propagation 



45 Special issue on special feature: An introduction to variable and feature selection 
Isabelle Guyon, Andre Elisseeff 

March 2003 The Journal of Machine Learning Research, volume 3 
Publisher: MIT Press 

Full text available: pdf(862.82 KB) Additional Information: full citation, abstract, citings, index terms 

Variable and feature selection have become the focus of much research in areas of 
application for which datasets with tens or hundreds of thousands of variables are 
available. These areas include text processing of internet documents, gene expression 
array analysis, and combinatorial chemistry. The objective of variable selection is three- 
fold: improving the prediction performance of the predictors, providing faster and more 
cost-effective predictors, and providing a better understanding of the ... 

Information retrieval algorithms: a survey 
Prabhakar Raghavan 

January 1997 Proceedings of the eighth annual ACM-SIAM symposium on Discrete 
algorithms 

Publisher: Society for Industrial and Applied Mathematics 

Full text available: pdf(908.76 KB) Additional Information: full citation, references, citings, index terms 



Scalable algorithms for mining large databases 
Rajeev Rastogi, Kyuseok Shim 

August 1999 Tutorial notes of the fifth ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Publisher: ACM Press 

Full text available: pdf(4.11 MB) Additional Information: full citation, references, citings, index terms 



48 Cross-language: Bootstrapping dictionaries for cross-language information retrieval 
Kornél Markó, Stefan Schulz, Olena Medelyan, Udo Hahn 

August 2005 Proceedings of the 28th annual international ACM SIGIR conference on 
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Research and development in information retrieval SIGIR '05 

Publisher: ACM Press 

Full text available: pdf(130.96 KB) Additional Information: full citation, abstract, references, index terms 

The bottleneck for dictionary-based cross-language information retrieval is the lack of 
comprehensive dictionaries, in particular for many different languages. We here introduce 
a methodology by which multilingual dictionaries (for Spanish and Swedish) emerge 
automatically from simple seed lexicons. These seed lexicons are automatically generated, 
by cognate mapping, from (previously manually constructed) Portuguese and German as 
well as English sources. Lexical and semantic hypotheses are then ... 

Keywords: cross-language information retrieval, lexical acquisition 



Selected IR-Related Dissertation Abstracts 
May 1991 ACM SIGIR Forum, volume 25 issue 1 
Publisher: ACM Press 

Full text available: pdf(2.71 MB) Additional Information: full citation, abstract 

The following are citations selected by title and abstract as being related to Information 
Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of 
the Dissertation Abstracts Online database produced by University Microfilms International 
(UMI). Included are UMI order number, title, author, degree, year, institution; number of 
pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen 
by the author, and abstract. Unless otherwise spec ... 

Hypertext for the electronic library?: CORE sample results 

Dennis E. Egan, Michael E. Lesk, R. Daniel Ketchum, Carol C. Lochbaum, Joel R. Remde, 
Michael Littman, Thomas K. Landauer 

September 1991 Proceedings of the third annual ACM conference on Hypertext 
Publisher: ACM Press 

Full text available: pdf(1.30 MB) Additional Information: full citation, references, citings, index terms 



Keywords: hypertext design, information retrieval 



Content awareness in a file system interface: implementing the "pile" metaphor for 
organizing information 

Daniel E. Rose, Richard Mander, Tim Oren, Dulce B. Ponceleon, Gitta Salomon, Yin Yin Wong 
July 1993 Proceedings of the 16th annual international ACM SIGIR conference on 
Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf(906.87 KB) Additional Information: full citation, abstract, references, citings, index terms 

The pile is a new element of the desktop user interface metaphor, designed to support the 
casual organization of documents. An interface design based on the pile concept 
suggested uses of content awareness for describing, organizing, and filing textual 
documents. We describe a prototype implementation of these capabilities, and give a 
detailed example of how they might appear to the user. We believe the system 
demonstrates how content awareness can be not only used in a computer filing syst ... 

52 Special issue on inductive logic programming: Learning semantic lexicons from a 
part-of-speech and semantically tagged corpus using inductive logic programming 
Vincent Claveau, Pascale Sebillot, Cecile Fabre, Pierrette Bouillon 
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December 2003 The Journal of Machine Learning Research, volume 4 
Publisher: MIT Press 

Full text available: pdf(215.86 KB) Additional Information: full citation, abstract, references, index terms 

This paper describes an inductive logic programming learning method designed to acquire 
from a corpus specific Noun-Verb (N-V) pairs, relevant in information retrieval 
applications to perform index expansion, in order to build up semantic lexicons based on 
Pustejovsky's generative lexicon (GL) principles (Pustejovsky, 1995). In one of the 
components of this lexical model, called the qualia structure, words are 
described in terms of semantic roles. For example, the ... 



53 An Internet-based negotiation server for e-commerce 

Stanley Y.W. Su, Chunbo Huang, Joachim Hammer, Yihua Huang, Haifei Li, Liu Wang, 
Youzhong Liu, Charnyote Pluempitiwiriyawej, Minsoo Lee, Herman Lam 

August 2001 The VLDB Journal — The International Journal on Very Large Data 

Bases, volume 10 issue 1 
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(355.19 KB) Additional Information: full citation, abstract, citings, index terms 

This paper describes the design and implementation of a replicable, Internet-based 
negotiation server for conducting bargaining-type negotiations between enterprises 
involved in e-commerce and e-business. Enterprises can be buyers and sellers of 
products/services or participants of a complex supply chain engaged in purchasing, 
planning, and scheduling. Multiple copies of our server can be installed to complement the 
services of Web servers. Each enterprise can install or select a trusted negotia ... 

Keywords: Constraint evaluation, Cost-benefit analysis, Database, E-commerce, 
Negotiation policy and strategy, Negotiation protocol 



54 Abstracting of legal cases: the SALOMON experience 
Marie-Francine Moens, Caroline Uyttendaele, Jos Dumortier 

June 1997 Proceedings of the 6th international conference on Artificial intelligence 
and law 

Publisher: ACM Press 

Full text available: pdf(1.24 MB) Additional Information: full citation, references, citings, index terms 



55 Capturing, indexing, clustering, and retrieving system history 
Ira Cohen, Steve Zhang, Moises Goldszmidt, Julie Symons, Terence Kelly, Armando Fox 
October 2005 ACM SIGOPS Operating Systems Review, Proceedings of the twentieth 
ACM symposium on Operating systems principles SOSP '05, volume 39 issue 5 

Publisher: ACM Press 

Full text available: pdf(516.41 KB) Additional Information: full citation, abstract, references, index terms 

We present a method for automatically extracting from a running system an indexable 
signature that distills the essential characteristic from a system state and that can be 
subjected to automated clustering and similarity-based retrieval to identify when an 
observed system state is similar to a previously-observed state. This allows operators to 
identify and quantify the frequency of recurrent problems, to leverage previous diagnostic 
efforts, and to establish whether problems seen at dif ... 

Keywords: bayesian networks, clustering, information retrieval, performance objectives, 
signatures 
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56 Structure and transformation of documents: Simple and accurate feature selection for Q 
hierarchical categorisation 
Wahyu Wibowo, Hugh E. Williams 

November 2002 Proceedings of the 2002 ACM symposium on Document engineering 

Publisher: ACM Press 

Full text available: pdf(161.40 KB) Additional Information: full citation, abstract, references, index terms 

Categorisation of digital documents is useful for organisation and retrieval. While 
document categories can be a set of unstructured category labels, some document 
categories are hierarchically structured. This paper investigates automatic hierarchical 
categorisation and, specifically, the role of features in the development of more effective 
categorisers. We show that a good hierarchical machine learning-based categoriser can be 
developed using small numbers of features from pre-categorised tra ... 

Keywords: categorisation, error reduction, hierarchical categorisation, web hierarchies 



57 Special issue on learning from imbalanced datasets: A multistrategy approach for 
digital text categorization from imbalanced documents 
M. Dolores del Castillo, Jose Ignacio Serrano 

June 2004 ACM SIGKDD Explorations Newsletter, volume 6 issue 1 
Publisher: ACM Press 

Full text available: pdf(338.63 KB) Additional Information: full citation, abstract, references, citings 

The goal of the research described here is to develop a multistrategy classifier system 
that can be used for document categorization. The system automatically discovers 
classification patterns by applying several empirical learning methods to different 
representations for preclassified documents belonging to an imbalanced sample. The 
learners work in a parallel manner, where each learner carries out its own feature 
selection based on evolutionary techniques and then obtains a classification mode ... 

Keywords: feature selection, genetic algorithms, multistrategy learning 




58 Multimedia: Boosted decision trees for word recognition in handwritten document 
retrieval 

Nicholas R. Howe, Toni M. Rath, R. Manmatha 

August 2005 Proceedings of the 28th annual international ACM SIGIR conference on 
Research and development in information retrieval SIGIR '05 

Publisher: ACM Press 

Full text available: pdf(170.01 KB) Additional Information: full citation, abstract, references, index terms 

Recognition and retrieval of historical handwritten material is an unsolved problem. We 
propose a novel approach to recognizing and retrieving handwritten manuscripts, based 
upon word image classification as a key step. Decision trees with normalized pixels as 
features form the basis of a highly accurate AdaBoost classifier, trained on a corpus of 
word images that have been resized and sampled at a pyramid of resolutions. To stem 
problems from the highly skewed distribution of class frequencies, ... 

Keywords: adaboost, decision theory, handwriting retrieval, historical manuscripts 




59 Information systems outsourcing: a survey and analysis of the literature 
Jens Dibbern, Tim Goles, Rudy Hirschheim, Bandula Jayatilaka 
November 2004 ACM SIGMIS Database, volume 35 issue 4 

Publisher: ACM Press 







Full text available: pdf(1.51 MB) Additional Information: full citation, abstract, references 

In the last fifteen years, academic research on information systems (IS) outsourcing has 
evolved rapidly. Indeed the field of outsourcing research has grown so fast that there has 
been scant opportunity for the research community to take a collective breath, and 
complete a global assessment of research activities to date. This paper seeks to address 
this need by exploring and synthesizing the academic literature on IS outsourcing. It 
offers a roadmap of the IS outsourcing literature, highligh ... 

Keywords: determinants, literature review, outcomes, outsourcing, relationships, 
research approaches, theoretical foundations 



60 Research track papers: Mining the space of graph properties 
Glen Jeh, Jennifer Widom 

August 2004 Proceedings of the tenth ACM SIGKDD international conference on 
Knowledge discovery and data mining KDD '04 

Publisher: ACM Press 

Full text available: pdf(255.01 KB) Additional Information: full citation, abstract, references, index terms 

Existing data mining algorithms on graphs look for nodes satisfying specific properties, 
such as specific notions of structural similarity or specific measures of link-based 
importance. While such analyses for predetermined properties can be effective in well- 
understood domains, sometimes identifying an appropriate property for analysis can be a 
challenge, and focusing on a single property may neglect other important aspects of the 
data. In this paper, we develop a foundation for mining the prop ... 



Keywords: data mining, graph mining 

61 Internet and WWW-based systems: High performance crawling system 
Younes Hafri, Chabane Djeraba 

October 2004 Proceedings of the 6th ACM SIGMM international workshop on 
Multimedia information retrieval 

Publisher: ACM Press 

Full text available: pdf(234.71 KB) Additional Information: full citation, abstract, references, index terms 

In the present paper, we will describe the design and implementation of a real-time 
distributed system of Web crawling running on a cluster of machines. The system crawls 
several thousands of pages every second, includes a high-performance fault manager, is 
platform independent and is able to adapt transparently to a wide range of configurations 
without incurring additional hardware expenditure. We will then provide details of the 
system architecture and describe the technical choices for ver ... 



Keywords: hierarchical cooperation, high availability system, web crawler 



62 Technical session 10: watermarking and multi-media processing: Thematic 
segmentation of meetings through document/speech alignment 
Dalila Mekhaldi, Denis Lalanne, Rolf Ingold 

October 2004 Proceedings of the 12th annual ACM international conference on 
Multimedia 

Publisher: ACM Press 

Full text available: pdf(917.84 KB) Additional Information: full citation, abstract, references, citings, index terms 

This article proposes a multimodal approach for segmenting meeting recordings. This bi- 
modal method takes advantages of the alignment of speech transcript with documents, in 
the context of meetings or lectures, where documents are discussed. The method first 
displays the alignment results as a set of nodes in a 2D space, where the two axes 
represent respectively the documents content and the speech transcript. The most 
connected regions in this graph are detected using a clustering method. Th ... 

Keywords: clustering techniques, document analysis, meeting dialogs structuring, 
multimedia information retrieval, multimodal thematic alignment, thematic segmentation 

63 The effect of expertise on software selection 
Dennis Galletta, Ruth C. King, Dina Rateb 
May 1993 ACM SIGMIS Database, volume 24 issue 2 
Publisher: ACM Press 

Full text available: pdf(1.28 MB) Additional Information: full citation, abstract, index terms 

Selection of hardware and software is a complicated task involving the consideration of 
multiple criteria decision making (MCDM) as well as the expertise of the decision maker. 
Two MCDM studies examined whether or not expertise in database management systems 
(DBMS) would facilitate selection among DBMS alternatives. The results of the first study 
suggest that (1) experts seemed to exhibit more agreement on criterion weights than did 
novices, (2) experts were about twice as consistent in applying ... 

64 KM-3 (knowledge management): knowledge extraction: Node ranking in labeled 
directed graphs 

Krishna P. Chitrapura, Srinivas R. Kashyap 

November 2004 Proceedings of the thirteenth ACM international conference on 
Information and knowledge management CIKM '04 

Publisher: ACM Press 

Full text available: pdf(447.39 KB) Additional Information: full citation, abstract, references, index terms 

Our work is motivated by the problem of ranking hyper-linked documents for a given 
query. Given an arbitrary directed graph with edge and node labels, we present a new 
flow-based model and an efficient method to dynamically rank the nodes of this graph 
with respect to any of the original labels. Ranking documents for a given query in a 
hyper-linked document set and ranking of authors/articles for a given topic in a citation 
database are some typical applications of our method. We outline the ... 

Keywords: citation graph, context-sensitive ranking, flow-based, intranet search, link 
structure, model, pagerank, random surfer model, search, search in context, web graph 



65 Personalizing E-commerce applications with on-line heuristic decision making 
Vinod Anupam, Richard Hull, Bharat Kumar 

April 2001 Proceedings of the 10th international conference on World Wide Web 

Publisher: ACM Press 

Full text available: pdf(261.12 KB) Additional Information: full citation, references, citings, index terms 



Keywords: B2C E-commerce, personalization, pro-active intervention, vortex rules 
system 



66 Theory 2: Linear discriminant model for information retrieval 
Jianfeng Gao, Haoliang Qi, Xinsong Xia, Jian-Yun Nie 

August 2005 Proceedings of the 28th annual international ACM SIGIR conference on 
Research and development in information retrieval SIGIR '05 

Publisher: ACM Press 

Full text available: pdf(532.96 KB) Additional Information: full citation, abstract, references, index terms 

This paper presents a new discriminative model for information retrieval (IR), referred to 
as linear discriminant model (LDM), which provides a flexible framework to incorporate 
arbitrary features. LDM is different from most existing models in that it takes into account 
a variety of linguistic features that are derived from the component models of HMM that is 
widely used in language modeling approaches to IR. Therefore, LDM is a means of 
melding discriminative and generative models for IR. We pr ... 

67 Experiments on the determination of the relationships between terms 
Vijay V. Raghavan, C. T. Yu 

June 1979 ACM Transactions on Database Systems (TODS), volume 4 issue 2 
Publisher: ACM Press 

Full text available: pdf(1.52 MB) Additional Information: full citation, abstract, references, citings, index terms 

The retrieval effectiveness of an automatic method that uses relevance judgments for the 
determination of positive as well as negative relationships between terms is evaluated. 
The term relationships are incorporated into the retrieval process by using a generalized 
similarity function that has a term match component, a positive term relationship 
component, and a negative term relationship component. Two strategies, query 
partitioning and query clustering, for the evaluation of the effectiv ... 

Keywords: antonym, document retrieval, feedback, pseudoclassification, semantics, 
statistical discrimination, synonym, term associations, thesaurus 



68 Systems: MITRE: description of the Alembic system used for MUC-6 

John Aberdeen, John Burger, David Day, Lynette Hirschman, Patricia Robinson, Marc Vilain 
November 1993 Proceedings of the 6th conference on Message understanding MUC6 
'95 

Publisher: Association for Computational Linguistics 

Full text available: pdf(1.14 MB) Additional Information: full citation, abstract, references 

As with several other veteran MUC participants, MITRE's Alembic system has undergone a 
major transformation in the past two years. The genesis of this transformation occurred 
during a dinner conversation at the last MUC conference, MUC-5. At that time, several of 
us reluctantly admitted that our major impediment towards improved performance was 
reliance on then-standard linguistic models of syntax. We knew we would need an 
alternative to traditional linguistic grammars, even to the somewh ... 

69 Long papers: multimodal interaction: Two-way adaptation for robust input 
interpretation in practical multimodal conversation systems 
Shimei Pan, Siwei Shen, Michelle X. Zhou, Keith Houck 

January 2005 Proceedings of the 10th international conference on Intelligent user 
interfaces 

Publisher: ACM Press 

Full text available: pdf(666.70 KB) Additional Information: full citation, abstract, references, index terms 

Multimodal conversation systems allow users to interact with computers effectively using 
multiple modalities, such as natural language and gesture. However, these systems have 
not been widely used in practical applications mainly due to their limited input 
understanding capability. As a result, conversation systems often fail to understand user 
requests and leave users frustrated. To address this issue, most existing approaches focus 
on improving a system's interpretation capability. Nonetheless ... 

Keywords: adaptive systems, context-sensitive help, intelligent multimodal interfaces, 
multimodal input interpretation, natural language understanding, robust input 
interpretation 

70 Clustering: ReCoM: reinforcement clustering of multi-type interrelated data objects 

Jidong Wang, Huajun Zeng, Zheng Chen, Hongjun Lu, Li Tao, Wei-Ying Ma 
July 2003 Proceedings of the 26th annual international ACM SIGIR conference on 
Research and development in information retrieval 
Publisher: ACM Press 

Full text available: pdf(204.94 KB) Additional Information: full citation, abstract, references, citings, index terms 

Most existing clustering algorithms cluster highly related data objects such as Web pages 
and Web users separately. The interrelation among different types of data objects is 
either not considered, or represented by a static feature space and treated in the same 
ways as other attributes of the objects. In this paper, we propose a novel clustering 
approach for clustering multi-type interrelated data objects, ReCoM (Reinforcement 
Clustering of Multi-type Interrelated data objects). Under this appr ... 

Keywords: clustering, interrelated, multi-type, reinforcement 



71 Development of a measure to assess the quality of user-developed applications 

Suzanne Rivard, Guylaine Poirier, Louis Raymond, François Bergeron 
June 1997 ACM SIGMIS Database, volume 28 issue 3 

Publisher: ACM Press 

Full text available: pdf(1.15 MB) Additional Information: full citation, abstract, citings, index terms 

For several years now, software quality has been a major concern for those involved in 
the area of software engineering, and researchers as well as practitioners of the domain 
have proposed instruments to measure it. Application quality is also a concern for 
researchers and managers involved in the area of end-user computing. However, since 
end-user computing research is in a much earlier stage than research in software 
engineering, relatively few efforts have been made to assess the quality of ... 

Keywords: end-user computing, quality measurement, system quality, user development 




72 Scatter/gather browsing communicates the topic structure of a very large text 
collection 

Peter Pirolli, Patricia Schank, Marti Hearst, Christine Diehl 

April 1996 Proceedings of the SIGCHI conference on Human factors in computing 

systems: common ground 
Publisher: ACM Press 

Full text available: pdf(1.23 MB), html(47.35 KB) Additional Information: full citation, references, citings, index terms 



Keywords: Scatter/Gather, browsing, clustering, information retrieval 



73 Link-based ranking: Object-level ranking: bringing order to Web objects 

Zaiqing Nie, Yuanzhi Zhang, Ji-Rong Wen, Wei-Ying Ma 
May 2005 Proceedings of the 14th international conference on World Wide Web 

Publisher: ACM Press 

Full text available: pdf(888.25 KB) Additional Information: full citation, abstract, references, index terms 

In contrast with the current Web search methods that essentially do document-level 
ranking and retrieval, we are exploring a new paradigm to enable Web search at the 
object level. We collect Web information for objects relevant for a specific application 
domain and rank these objects in terms of their relevance and popularity to answer user 
queries. Traditional PageRank model is no longer valid for object popularity calculation 
because of the existence of heterogeneous relationships between obje ... 

Keywords: PageRank, PopRank, Web information retrieval, Web objects, link analysis 



74 Toward a unified approach to statistical language modeling for Chinese 
Jianfeng Gao, Joshua Goodman, Mingjing Li, Kai-Fu Lee 

March 2002 ACM Transactions on Asian Language Information Processing (TALIP), 

Volume 1 Issue 1 
Publisher: ACM Press 

Full text available: pdf(1.19 MB) Additional Information: full citation, abstract, references, citings, index terms 

This article presents a unified approach to Chinese statistical language modeling (SLM). 
Applying SLM techniques like trigram language models to Chinese is challenging because 
(1) there is no standard definition of words in Chinese; (2) word boundaries are not 
marked by spaces; and (3) there is a dearth of training data. Our unified approach 
automatically and consistently gathers a high-quality training data set from the Web, 
creates a high-quality lexicon, segments the training data using this ... 

Keywords: Chinese language, Chinese pinyin-to-character conversion, backoff, character 
error rate, domain adaptation, lexicon, n-gram model, perplexity, pruning, smoothing, 
statistical language modeling, word segmentation 



75 Supporting top-k join queries in relational databases 
F. Ilyas, G. Aref, K. Elmagarmid 

September 2004 The VLDB Journal — The International Journal on Very Large Data 

Bases, volume 13 Issue 3 
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(317.70 KB) Additional Information: full citation, abstract, index terms 

Ranking queries, also known as top-k queries, produce results that are ordered on some 
computed score. Typically, these queries involve joins, where users are usually interested 
only in the top-k join results. Top-k queries are dominant in many emerging applications, 
e.g., multimedia retrieval by content, Web databases, data mining, middlewares, and 
most information retrieval applications. Current relational query processors do not handle 
ranking queries efficiently, especia ... 

Keywords: Query operators, Rank aggregation, Ranking, Top-k queries 



76 Web search 1: Using ODP metadata to personalize search 
Paul Alexandru Chirita, Wolfgang Nejdl, Raluca Paiu, Christian Kohlschutter 
August 2005 Proceedings of the 28th annual international ACM SIGIR conference on 

Research and development in information retrieval SIGIR '05 
Publisher: ACM Press 

Full text available: pdf(310.29 KB) Additional Information: full citation, abstract, references, index terms 

The Open Directory Project is clearly one of the largest collaborative efforts to manually 
annotate web pages. This effort involves over 65,000 editors and resulted in metadata 
specifying topic and importance for more than 4 million web pages. Still, given that this 
number is just about 0.05 percent of the Web pages indexed by Google, is this effort 
enough to make a difference? In this paper we discuss how these metadata can be 
exploited to achieve high quality personalized web search. First, we ... 

77 Paper session IR-2 (information retrieval): question answering: Retrieving answers 
from frequently asked questions pages on the web 
Valentin Jijkoun, Maarten de Rijke 

October 2005 Proceedings of the 14th ACM international conference on Information 
and knowledge management CIKM '05 

Publisher: ACM Press 

Full text available: pdf(233.07 KB) Additional Information: full citation, abstract, references, index terms 

We address the task of answering natural language questions by using the large number 
of Frequently Asked Questions (FAQ) pages available on the web. The task involves three 
steps: (1) fetching FAQ pages from the web; (2) automatic extraction of question/answer 
(Q/A) pairs from the collected pages; and (3) answering users' questions by retrieving 
appropriate Q/A pairs. We discuss our solutions for each of the three tasks, and give 
detailed evaluation results on a collected corpus of about 3.6Gb ... 

Keywords: FAQ retrieval, question answering, questions beyond factoids 



78 A semi-supervised document clustering technique for information organization 




Han-Joon Kim, Sang-Goo Lee 

November 2000 Proceedings of the ninth international conference on Information and 
knowledge management 

Publisher: ACM Press 

Full text available: pdf(261.40 KB) Additional Information: full citation, references, citings, index terms 




Keywords: agglomerative hierarchical clustering, document clustering, fuzzy information 
retrieval, information organization, relevance feedback 



79 Semantics, ontologies & enterprise integration track: Concept abduction and 
contraction for semantic-based discovery of matches and negotiation spaces in an 
e-marketplace 

Simona Colucci, Tommaso Di Noia, Eugenio Di Sciascio, Marina Mongiello, Francesco M. 
Donini 

March 2004 Proceedings of the 6th international conference on Electronic commerce 
ICEC '04 

Publisher: ACM Press 

Full text available: pdf(495.44 KB) Additional Information: full citation, abstract, references, index terms 

In this paper we present a Description Logic approach to extended matchmaking between 
Demands and Supplies in an Electronic Marketplace, which allows the semantic-based 
treatment of negotiable and strict requirements in the description. To this aim we exploit 
two novel non-standard Description Logic inference services, Concept Contraction (which 
extends satisfiability) and Concept Abduction (which extends subsumption). Based on these 
services we devise algorithms to find negotiation spaces and to de ... 

Keywords: concept abduction, concept contraction, description logics, e-commerce, 
matchmaking, negotiable constraints, semantic web 

80 Supporting activities: Proactive support for the organization of shared workspaces 
using activity patterns and content analysis 
Wolfgang Prinz, Baber Zaman 

November 2005 Proceedings of the 2005 international ACM SIGGROUP conference on 
Supporting group work GROUP '05 

Publisher: ACM Press 

Full text available: pdf(1.10 MB) Additional Information: full citation, abstract, references, index terms 

Shared workspace systems provide virtual places for self-organized and semi-structured 
cooperation between local and distributed team members. These cooperation systems 
have been adopted by a large community over the past years and the volume of managed 
information is increasing rapidly. However, a problem that occurs frequently is the 
missing user support for the workspace organization and a lack of assistance finding the 
right place for storing new documents and contributions. This often resul ... 

Keywords: awareness, content categorization for shared workspaces, noise detection for 
shared workspaces, office and workplace, semantic web for shared workspaces, shared 
workspaces, similar documents search, social computing and social navigation 

81 Tree induction vs. logistic regression: a learning-curve analysis 
Claudia Perlich, Foster Provost, Jeffrey S. Simonoff 
December 2003 The Journal of Machine Learning Research, volume 4 
Publisher: MIT Press 

Full text available: pdf(263.37 KB) Additional Information: full citation, abstract, references, citings, index terms 



Tree induction and logistic regression are two standard, off-the-shelf methods for building 
models for classification. We present a large-scale experimental comparison of logistic 
regression and tree induction, assessing classification accuracy and the quality of rankings 
based on class-membership probabilities. We use a learning-curve analysis to examine 
the relationship of these measures to the size of the training set. The results of the study 
show several things. (1) Contrary to some prior o ... 

82 A survey on wavelet applications in data mining 
Tao Li, Qi Li, Shenghuo Zhu, Mitsunori Ogihara 

December 2002 ACM SIGKDD Explorations Newsletter, volume 4 issue 2 
Publisher: ACM Press 

Full text available: pdf(330.06 KB) Additional Information: full citation, abstract, references, citings 

Recently there has been significant development in the use of wavelet methods in various 
data mining processes. However, no comprehensive survey has been written 
on the topic. The goal of this paper is to fill the void. First, the paper presents a 
high-level data-mining framework that reduces the overall process into smaller 
components. Then applications of wavelets for each component are reviewed. The paper 
concludes by discussing the impact of wavelets on data mining research an ... 

83 Text categorization for multiple users based on semantic features from a 
machine-readable dictionary 

Elizabeth D. Liddy, Woojin Paik, Edmund S. Yu 

July 1994 ACM Transactions on Information Systems (TOIS), volume 12 issue 3 
Publisher: ACM Press 

Full text available: pdf(1.17 MB) Additional Information: full citation, abstract, references, citings, index terms, review 

The text categorization module described here provides a front-end filtering function for 
the larger DR-LINK text retrieval system [Liddy and Myaeng 1993]. The model evaluates 
a large incoming stream of documents to determine which documents are sufficiently 
similar to a profile at the broad subject level to warrant more refined representation and 
matching. To accomplish this task, each substantive word in a text is first categorized 
using a feature set based on the semantic Subject Field ... 

Keywords: semantic vectors, subject field coding 



84 Full Technical Papers: Learning implicit user interest hierarchy for context in 
personalization 

Hyoung R. Kim, Philip K. Chan 

January 2003 Proceedings of the 8th international conference on Intelligent user 
interfaces 

Publisher: ACM Press 

Full text available: pdf(191.53 KB) Additional Information: full citation, abstract, references, citings, index terms 

To provide a more robust context for personalization, we desire to extract a continuum of 
general (long-term) to specific (short-term) interests of a user. Our proposed approach is 
to learn a user interest hierarchy (UIH) from a set of web pages visited by a user. We 
devise a divisive hierarchical clustering (DHC) algorithm to group words (topics) into a 
hierarchy where more general interests are represented by a larger set of words. Each 
web page can then be assigned to nodes in the hierarchy f ... 

Keywords: clustering algorithm, concept clustering, user interest hierarchy, user profile 



85 Research track paper: Mining images on semantics via statistical learning 
Jianping Fan, Hangzai Luo, Mohand-Said Hacid 

August 2005 Proceeding of the eleventh ACM SIGKDD international conference on 
Knowledge discovery in data mining KDD '05 

Publisher: ACM Press 

Full text available: pdf(817.66 KB) Additional Information: full citation, abstract, references, index terms 

In this paper, we have proposed a novel framework to enable hierarchical image 
classification via statistical learning. By integrating the concept hierarchy for semantic 
image concept organization, a hierarchical mixture model is proposed to enable 
multi-level modeling of semantic image concepts and hierarchical classifier combination. Thus, 
learning the classifiers for the semantic image concepts at the high level of the concept 
hierarchy can be effectively achieved by detecting t ... 

Keywords: adaptive EM algorithm, hierarchical mixture model, image classification 



86 Industry/government track paper: Learning to predict train wheel failures 
Chunsheng Yang, Sylvain Letourneau 

August 2005 Proceeding of the eleventh ACM SIGKDD international conference on 
Knowledge discovery in data mining KDD '05 

Publisher: ACM Press 

Full text available: pdf(1.13 MB) Additional Information: full citation, abstract, references, index terms 

This paper describes a successful but challenging application of data mining in the railway 
industry. The objective is to optimize maintenance and operation of trains through 
prognostics of wheel failures. In addition to reducing maintenance costs, the proposed 
technology will help improve railway safety and augment throughput. Building on 
established techniques from data mining and machine learning, we present a 
methodology to learn models to predict train wheel failures from readily available ... 

87 Tools & techniques track: browsing and visualizing collections: An initial evaluation of 
automated organization for digital library browsing 
Aaron Krowne, Martin Halbert 

June 2005 Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries 

Publisher: ACM Press 

Full text available: pdf(243.36 KB) Additional Information: full citation, abstract, references, index terms 

In this article we present an evaluation of text clustering and classification methods for 
creating digital library browse interfaces, focusing on the particular case of collections 
made up of heterogeneous metadata records. This situation is common in "portal" style 
digital libraries, which are built by harvesting content from many disparate sources, 
typically using the Open Archives Protocol for Metadata Harvesting (OAI-PMH). By 
studying the activity of users in an experimental system, we find ... 

Keywords: NMF, browsing, categorization, classification, clustering, digital libraries, 
harvesting, portals, taxonomies 

American Journal of Computational Linguistics Staff 
April 1983 Computational Linguistics, volume 9 issue 2 
Publisher: MIT Press 
Full text available* 

89 A framework for specifying explicit bias for revision of approximate information 
extraction rules 

Ronen Feldman, Yair Liberzon, Binyamin Rosenfeld, Jonathan Schler, Jonathan Stoppi 
August 2000 Proceedings of the sixth ACM SIGKDD international conference on 

Knowledge discovery and data mining 
Publisher: ACM Press 

Full text available: pdf(173.48 KB) Additional Information: full citation, references, citings, index terms 



Keywords: information extraction, text mining, theory revision, user guided revision 



90 Optimal aggregation algorithms for middleware 
Ronald Fagin, Amnon Lotem, Moni Naor 

May 2001 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium 
on Principles of database systems 

Publisher: ACM Press 

Full text available: pdf(231.47 KB) Additional Information: full citation, abstract, references, citings, index terms 

Assume that each object in a database has m grades, or scores, one for each of m 
attributes. For example, an object can have a color grade, that tells how red it is, and a 
shape grade, that tells how round it is. For each attribute, there is a sorted list, which lists 
each object and its grade under that attribute, sorted by grade (highest grade first). 
There is some monotone aggregation function, or combining rule, such as min or average, 
that combines the individ ... 

91 Large Margin Methods for Structured and Interdependent Output Variables 
Ioannis Tsochantaridis, Thorsten Joachims, Thomas Hofmann, Yasemin Altun 
September 2005 The Journal of Machine Learning Research, volume 6 
Publisher: MIT Press 

Full text available: pdf(240.67 KB) Additional Information: full citation, abstract 

Learning general functional dependencies between arbitrary input and output spaces is 
one of the key challenges in computational intelligence. While recent progress in machine 
learning has mainly focused on designing flexible and powerful input representations, this 
paper addresses the complementary issue of designing classification algorithms that can 
deal with more complex outputs, such as trees, sequences, or sets. More generally, we 
consider problems involving multiple dependent output varia ... 

92 Long papers: visualization and presentation: A graph-matching approach to dynamic 
media allocation in intelligent multimedia interfaces 
Michelle X. Zhou, Zhen Wen, Vikram Aggarwal 

January 2005 Proceedings of the 10th international conference on Intelligent user 
interfaces 

Publisher: ACM Press 

Full text available: pdf(1.14 MB) Additional Information: full citation, abstract, references, index terms 

To aid users in exploring large and complex data sets, we are building an intelligent 
multimedia conversation system. Given a user request, our system dynamically creates a 
multimedia response that is tailored to the interaction context. In this paper, we focus on 
the problem of media allocation, a process that assigns one or more media, such as 
graphics or speech, to best convey the intended response content. Specifically, we 
develop a graph-matching approach to media allocation, whose goal is ... 

Keywords: automated generation of multimedia presentations, intelligent multimedia 
interfaces, media allocation 



93 Multi Relational Data Mining (MRDM): Probabilistic logic learning
Luc De Raedt, Kristian Kersting

July 2003 ACM SIGKDD Explorations Newsletter, Volume 5 Issue 1
Publisher: ACM Press

Full text available: pdf(1.98 MB) Additional Information: full citation, abstract, references, citings

The past few years have witnessed a significant interest in probabilistic logic learning,
i.e. in research lying at the intersection of probabilistic reasoning, logical representations, 
and machine learning. A rich variety of different formalisms and learning techniques have 
been developed. This paper provides an introductory survey and overview of the state-of- 
the-art in probabilistic logic learning through the identification of a number of important 
probabilistic, logical and learning concept ... 

Keywords: data mining, inductive logic programming, machine learning, multi-relational 
data mining, probabilistic reasoning, uncertainty 

94 Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory

November 2000 ACM Transactions on Information and System Security (TISSEC), Volume 3 Issue 4
Publisher: ACM Press

Full text available: pdf(156.16 KB) Additional Information: full citation, abstract, references, citings, index terms, review

In 1998 and again in 1999, the Lincoln Laboratory of MIT conducted a comparative
evaluation of intrusion detection systems (IDSs) developed under DARPA funding. While
this evaluation represents a significant and monumental undertaking, there are a number
of issues associated with its design and execution that remain unsettled. Some
methodologies used in the evaluation are questionable and may have biased its results. 
One problem is that the evaluators have published relatively little concer ... 

Keywords: computer security, intrusion detection, receiver operating curves (ROC), 
software evaluation 



95 The Anti-Mac interface 
Don Gentner, Jakob Nielsen

August 1996 Communications of the ACM, Volume 39 issue 8 
Publisher: ACM Press 

Full text available: pdf(m38KB) Additional Information: full citation, references, citings, index terms

Motion recovery for video content classification
Nevenka Dimitrova, Forouzan Golshani

October 1995 ACM Transactions on Information Systems (TOIS), Volume 13 Issue 4
Publisher: ACM Press

Full text available: pdf(2.74 MB) Additional Information: full citation, abstract, references, citings, index terms

Like other types of digital information, video sequences must be classified based on the 
semantics of their contents. A more-precise and completer extraction of semantic 
information will result in a more-effective classification. The most-discernible difference
between still images and moving pictures stems from movements and variations. Thus, to 
go from the realm of still-image repositories to video databases, we must be able to deal 
with motion. Particularly, we need the ability to classi ... 

Keywords: MPEG compressed video analysis, content-based retrieval of video, motion 
recovery, video databases, video retrieval 



97 Industry track session: Feature-based recommendation system
Eui-Hong (Sam) Han, George Karypis

October 2005 Proceedings of the 14th ACM international conference on Information and knowledge management CIKM '05

Publisher: ACM Press

Full text available: pdf(105.58 KB) Additional Information: full citation, abstract, references, index terms

The explosive growth of the world-wide-web and the emergence of e-commerce has led
to the development of recommender systems, a personalized information filtering
technology used to identify a set of N items that will be of interest to a certain user. User- 
based and model-based collaborative filtering are the most successful technology for 
building recommender systems to date and is extensively used in many commercial 
recommender systems. The basic assumption in these algorithms is ... 

Keywords: collaborative filtering, e-commerce, product features, recommender systems

98 Tree-Based Batch Mode Reinforcement Learning
Damien Ernst, Pierre Geurts, Louis Wehenkel

September 2005 The Journal of Machine Learning Research, Volume 6
Publisher: MIT Press

Full text available: pdf(1.32 MB) Additional Information: full citation, abstract

Reinforcement learning aims to determine an optimal control policy from interaction with 
a system or from observations gathered from a system. In batch mode, it can be 
achieved by approximating the so-called Q-function based on a set of four-tuples
$(x_t, u_t, r_t, x_{t+1})$, where $x_t$ denotes the system state at time $t$, $u_t$ the control action taken, $r_t$ the
instantane ...




99 Tree-Based Batch Mode Reinforcement Learning 
Damien Ernst, Pierre Geurts, Louis Wehenkel 
April 2005 The Journal of Machine Learning Research, volume 6 
Publisher: MIT Press 

Full text available: pdf(1.41 MB) Additional Information: full citation, abstract

Reinforcement learning aims to determine an optimal control policy from interaction with 
a system or from observations gathered from a system. In batch mode, it can be 
achieved by approximating the so-called Q-function based on a set of four-tuples
$(x_t, u_t, r_t, x_{t+1})$, where $x_t$ denotes the system state at time $t$, $u_t$ the control action taken, $r_t$ the
instantane ...




100 Evaluating hypermedia and learning: methods and results from the Perseus Project
Gary Marchionini, Gregory Crane 

January 1994 ACM Transactions on Information Systems (TOIS), volume 12 issue 1 
Publisher: ACM Press 

Full text available: pdf(2.57 MB) Additional Information: full citation, abstract, references, citings, index terms, review

The Perseus Project has developed a hypermedia corpus of materials related to the 
ancient Greek world. The materials include a variety of texts and images, and tools for 
using these materials and navigating the system. Results from a three-year evaluation of
Perseus use in a variety of college settings are described. The evaluation assessed both 
this particular system and the application of the technological genre to information 
management and to learning. The evaluation used a variety of me ... 

Keywords: human-computer interaction, hypermedia, learning, teaching

101 Bitext correspondences through rich mark-up
Raquel Martinez, Joseba Abaitua, Arantza Casillas 

August 1998 Proceedings of the 17th international conference on Computational
linguistics - Volume 2, Proceedings of the 36th annual meeting on
Association for Computational Linguistics - Volume 2

Publisher: Association for Computational Linguistics, Association for Computational Linguistics

Full text available: pdf(651.98 KB), Publisher Site Additional Information: full citation, abstract, references



Rich mark-up can considerably benefit the process of establishing bitext correspondences, 
that is, the task of providing correct identification and alignment methods for text 
segments that are translation equivalences of each other in a parallel corpus. We present 
a sentence alignment algorithm that, by taking advantage of previously annotated texts, 
obtains accuracy rates close to 100%. The algorithm evaluates the similarity of the 
linguistic and extralinguistic mark-up in both sides of a bitex ...

102 Efficient discovery of error-tolerant frequent itemsets in high dimensions
Cheng Yang, Usama Fayyad, Paul S. Bradley 

August 2001 Proceedings of the seventh ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Publisher: ACM Press 

Full text available: pdf(1.11 MB) Additional Information: full citation, abstract, references, citings, index terms



We present a generalization of frequent itemsets allowing for the notion of errors in the 
itemset definition. We motivate the problem and present an efficient algorithm that 
identifies error-tolerant frequent clusters of items in transactional data (customer- 
purchase data, web browsing data, text, etc.). The algorithm exploits sparseness of the 
underlying data to find large groups of items that are correlated over database records 
(rows). The notion of transaction coverage allows us to extend th ... 

Keywords: Error-tolerant frequent itemset, clustering, collaborative filtering, high 
dimensions, query selectivity estimation

August 1997 Proceedings of the conference on Designing interactive systems: processes, practices, methods, and techniques

Full text available: pdf(778.76 KB) Additional Information: full citation, references, index terms



Keywords: attribute importance, smart product, usability 



104 Session 1: creative mathematics: Secure multi-party computation problems and their applications: a review and open problems
Wenliang Du, Mikhail J. Atallah

September 2001 Proceedings of the 2001 workshop on New security paradigms

Publisher: ACM Press

Full text available: pdf(909.09 KB) Additional Information: full citation, abstract, references, citings, index terms

The growth of the Internet has triggered tremendous opportunities for cooperative 
computation, where people are jointly conducting computation tasks based on the private 
inputs they each supply. These computations could occur between mutually untrusted
parties, or even between competitors. For example, customers might send to a remote 
database queries that contain private information; two competing financial organizations 
might jointly invest in a project that must satisfy both organizations' ... 

Keywords: privacy, secure multi-party computation 



Cluster ensembles - a knowledge reuse framework for combining multiple partitions
Alexander Strehl, Joydeep Ghosh 

March 2003 The Journal of Machine Learning Research, volume 3 
Publisher: MIT Press 

Full text available: pdf(842.50 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper introduces the problem of combining multiple partitionings of a set of objects
into a single consolidated clustering without accessing the features or algorithms that
determined these partitionings. We first identify several application scenarios for the
resultant 'knowledge reuse' framework that we call cluster ensembles. The cluster
ensemble problem is then formalized as a combinatorial optimization problem in terms of 
shared mutual information. In addition to a direct ... 

Keywords: cluster analysis, clustering, consensus functions, ensemble, knowledge reuse, 
multi-learner systems, mutual information, partitioning, unsupervised learning

Philip Koltun, Lionel E. Deimel, Jo Perry

February 1983 ACM SIGCSE Bulletin, Proceedings of the fourteenth SIGCSE technical
symposium on Computer science education SIGCSE '83, Volume 15 Issue 1
Publisher: ACM Press

Full text available: pdf(782.40 KB) Additional Information: full citation, abstract, references, citings, index terms

We present some ideas here about prose reading comprehension tests, with analogies to 
program reading exercises, and suggest the potential usefulness of a standardized, 
nationwide program reading comprehension test as a means to assess on a comparative
basis individual and department-wide progress through the computer science curriculum.
We conclude with a research agenda on program reading and encourage contributions to
the work from interested colleagues.

Quantitative evaluation of software quality
B. W. Boehm, J. R. Brown, M. Lipow 

October 1976 Proceedings of the 2nd international conference on Software 
engineering 

Publisher: IEEE Computer Society Press 

Full text available: pdf(1.44 MB) Additional Information: full citation, abstract, references, citings, index terms

The study reported in this paper establishes a conceptual framework and some key initial 
results in the analysis of the characteristics of software quality. Its main results and 
conclusions are: • Explicit attention to characteristics of software quality can lead to 
significant savings in software life-cycle costs. • The current software state-of-the-art 
imposes specific limitations on our ability to automatically and quantitatively evaluate the 
quality of so ... 

Keywords: Management by objectives, Quality assurance, Quality characteristics, Quality
metrics, Software engineering, Software measurement and evaluation, Software quality,
Software reliability, Software standards, Testing

Fabian Monrose, Aviel Rubin

April 1997 Proceedings of the 4th ACM conference on Computer and communications security
Publisher: ACM Press

Full text available: pdf(1.18 MB) Additional Information: full citation, references, citings, index terms

May 1997 Communications of the ACM, Volume 40 Issue 5

Full text available: pdf(676.39 KB) Additional Information: full citation, references, citings, index terms

COTS decision-making: a goal-driven requirements engineering perspective
Carina Alves, Anthony Finkelstein 

July 2002 Proceedings of the 14th international conference on Software engineering 
and knowledge engineering SEKE '02 

Publisher: ACM Press 

Full text available: pdf(1.07 MB) Additional Information: full citation, abstract, references

This position paper outlines the problems and risks of selecting COTS products. In
particular, we highlight the challenges of the decision-making process where requirements 
specification plays an essential role to evaluate and compare products features. It is 
necessary to perform a careful balancing between requirements and COTS features. 
Customers may have to compromise on requirements not satisfied by any available 
product or request products modifications. We analyse the problems and risks ar ...

August 1994 ACM SIGMIS Database, Volume 25 Issue 3
Publisher: ACM Press

Full text available: pdf(1.78 MB) Additional Information: full citation, abstract, index terms

MIS projects are selected by any of four different groups within organizations: top
management, steering committees, user departments, and MIS departments. Because of
their inherent differences, each of these groups is likely to favor different types of MIS
projects. That is, they exhibit different selection biasing. An investigation of the nature 
and extent of this biasing is examined in this research. Data were collected from 176 MIS 
projects selected from 60 organizations. Projects were catego ... 

Keywords: MIS project selection, application system selection, resource allocation, 
steering committees, top management involvement

Publisher: MIT Press

Full text available: pdf, Publisher Site Additional Information: full citation, abstract, references, citings

This paper presents empirical results comparing spoken and keyboard communication. It 
is shown that speakers attempt to achieve more detailed goals in giving instructions than 
do users of keyboards. One specific kind of fine-grained communicative act, a request that 
the hearer identify the referent of a noun phrase, is shown to dominate spoken 
instruction-giving discourse, but is nearly absent from keyboard discourse. Most 
important, these requests are only achieved "indirectly" - through utte ...

113 A comparative analysis of MIS project selection mechanisms
James D. McKeen, Tor Guimaraes, James C. Wetherbe
February 1994 ACM SIGMIS Database, Volume 25 Issue 1
Publisher: ACM Press

Full text available: pdf(1.63 MB) Additional Information: full citation, abstract, index terms

MIS projects are selected by any of four different groups within organizations: top
management, steering committees, user departments and MIS departments. Because of 
their inherent differences, each of these groups is likely to favor different types of MIS 
projects. That is, they exhibit different selection biasing. An investigation of the nature 
and extent of this biasing is examined in this research. Data were collected from 176 MIS 
projects selected from 60 organizations. Projects were catego ... 

Keywords: MIS project selection, application system selection, resource allocation, 
steering committees, top management involvement

Full text available: pdf(2.76 MB) Additional Information: full citation, abstract, references, index terms

The use of logic in identifying and analyzing inconsistency in requirements from multiple
stakeholders has been found to be effective in a number of studies. Nonmonotonic logic is 
a theoretically well-founded formalism that is especially suited for supporting the 
evolution of requirements. However, direct use of logic for expressing requirements and 
discussing them with stakeholders poses serious usability problems, since in most cases 
stakeholders cannot be expected to be fluent with formal log ... 

Keywords: Requirements, default logic, inconsistency, natural language

projection

Ellen Riloff, Charles Schafer, David Yarowsky

August 2002 Proceedings of the 19th international conference on Computational
linguistics - Volume 1

Publisher: Association for Computational Linguistics

Full text available: pdf(110.85 KB) Additional Information: full citation, abstract, references

Information extraction (IE) systems are costly to build because they require development 
texts, parsing tools, and specialized dictionaries for each application domain and each 
natural language that needs to be processed. We present a novel method for rapidly 
creating IE systems for new languages by exploiting existing IE systems via cross- 
language projection. Given an IE system for a source language (e.g., English), we can 
transfer its annotations to corresponding texts in a target language (e. ... 



Exploring the applications of user-expertise assessment for intelligent interfaces

Michel C. Desmarais, Jiming Liu

May 1993 Proceedings of the SIGCHI conference on Human factors in computing systems
Publisher: ACM Press

Full text available: pdf(639.86 KB) Additional Information: full citation, abstract, references, index terms

An adaptive user interface relies, to a large extent, upon an adequate user model (e.g., a 
representation of user-expertise). However, building a user model may be a tedious and 
time consuming task that will render such an interface unattractive to developers. We 
thus need an effective means of inferring the user model at low cost. In this paper, we 
describe a technique for automatically inferring a fine-grain model of a user's knowledge 
state based on a small number of observations. With t ... 

Keywords: adaptive training systems, entropy, evidence aggregation, intelligent 
interfaces, knowledge spaces, probabilistic reasoning, user-expertise assessment

August 1996 Proceedings of the 16th conference on Computational linguistics - Volume 1

Full text available: pdf(660.29 KB) Additional Information: full citation, abstract, references

The development of natural language processing (NLP) systems that perform machine 
translation (MT) and information retrieval (IR) has highlighted the need for the automatic 
recognition of proper names. While various name recognizers have been developed, they 
suffer from being too limited; some only recognize one name class, and all are language 
specific. This work develops an approach to multilingual name recognition that allows a 
system optimized for one language to be ported to another with li ...

Learning about hidden events in system interactions

May 1986 ACM SIGCHI Bulletin, Proceedings of the SIGCHI/GI conference on Human factors in computing systems and graphics interface CHI '87, Volume 17 Issue SI

Publisher: ACM Press

Full text available: pdf(506.37 KB) Additional Information: full citation, abstract, references, citings, index terms

Understanding how to use a computer system often requires knowledge of hidden events: 
things which happen as a result of user actions but which produce no immediate 
perceptible effect. How do users learn about these events? Will learners explain the 
mechanism in detail or only at the level at which they are able to use it? We extend Lewis' 
EXPL model of causal analysis, incorporating ideas from Miyake, Draper, and Dietterich, to 
give an account of learning about hidden events from examples. ... 

Keywords: example-based learning, explanations, models of learning 



Cluster ensemble and its applications in gene expression analysis
Xiaohua Hu, Illhoi Yoo 

January 2004 Proceedings of the second conference on Asia-Pacific bioinformatics - 
Volume 29 CRPIT '04 

Publisher: Australian Computer Society, Inc. 

Full text available: pdf(377.08 KB) Additional Information: full citation, abstract, references

Huge amount of gene expression data have been generated as a result of the human 
genomic project. Clustering has been used extensively in mining these gene expression 
data to find important genetic and biological information. Obtaining high quality clustering 
results is very challenging because of the inconsistency of the results of different 
clustering algorithms and noise in the gene expression data. Many clustering algorithms 
are available and different clustering algorithms may generate diff ... 

Keywords: cluster ensemble, gene expression analysis, graph partition

November 2003 Proceedings of the eleventh ACM international conference on
Multimedia

Publisher: ACM Press

Full text available: pdf(275.37 KB) Additional Information: index terms

Providing accurate and scalable solutions to map low-level perceptual features to high- 
level semantics is critical for multimedia information organization and retrieval. In this 
paper, we propose a confidence-based dynamic ensemble (CDE) to overcome the 
shortcomings of the traditional static classifiers. In contrast to the traditional models, CDE 
can make dynamic adjustments to accommodate new semantics, to assist the discovery of 
useful low-level features, and to improve class-prediction ...

121 Poster 3: content track: Highlight ranking for sports video browsing
Xiaofeng Tong, Qingshan Liu, Yifan Zhang, Hanqing Lu

November 2005 Proceedings of the 13th annual ACM international conference on
Multimedia MULTIMEDIA '05

Publisher: ACM Press

Full text available: pdf(344.00 KB) Additional Information: full citation, abstract, references, index terms

Sports video has been extensively studied for its wide viewer-ship and tremendous 
commercial potentials. Many studies focused on highlight extraction for summarizing a 
lengthy video. In this paper, we present an advanced highlight analysis system for sports 
video browsing, in which highlight evaluation and ranking are concerned besides highlight 
detection. First, we use replay detection to efficiently localize the highlights. Then 
incorporating with domain-specific knowledge, we adopt several si ... 

Keywords: highlight ranking, replay detection, video browsing

Richard B. Kieburtz, Laura McKinney, Jeffrey M. Bell, James Hook, Alex Kotov, Jeffrey Lewis,

May 1996 Proceedings of the 18th international conference on Software engineering
Publisher: IEEE Computer Society

Full text available: pdf(1.15 MB), Publisher Site Additional Information: full citation, abstract, references, citings, index terms

The paper presents results of a software engineering experiment in which a new
technology for constructing program generators from domain-specific specification 
languages has been compared with a reuse technology that employs sets of reusable Ada 
program templates. Both technologies were applied to a common problem domain, 
constructing message translation and validation modules for military command, control, 
communications and information systems (C³I). The experiment employed four
subject ... 

Keywords: flexibility, productivity, reliability, software component generation, usability

July 2001 Proceedings of the 23rd International Conference on Software Engineering

Full text available: pdf(110.12 KB), Publisher Site

In 1996, USC switched its core two-semester software engineering course from a
hypothetical-project, homework-and-exam course based on the Bloom taxonomy of 
educational objectives (knowledge, comprehension, application, analysis, synthesis, 
evaluation). The revised course is a real-client team-project course based on the CRESST 
model of learning objectives (content understanding, problem solving, collaboration, 
communication, and self-regulation). We used the CRESST cognitive demands analysis ... 

Keywords: process models, product models, project courses, property models, risk 
management, software engineering education, success models

This paper reports the views of 80 senior IT managers about IT evaluation approaches,
and the benefits that IT provides for their organizations. Their views were obtained 
through a survey mailed to medium to large organizations in both Europe and the US. The 
survey sought answers to three questions: How does the senior IT manager's 
organization assess the value of its (1) overall IT investment portfolio? (2) individual IT 
projects and applications? (3) IT function? Questions for the survey were ... 

Keywords: IS effectiveness, IS evaluation, IS success, IT benchmarking, IT evaluation, 
IT outsourcing, balanced scorecard, feasibility studies, post-implementation review 



125 The participatory design of a sound and image enhanced daily planner for people with aphasia

Karyn Moffatt, Joanna McGrenere, Barbara Purves, Maria Klawe

April 2004 Proceedings of the SIGCHI conference on Human factors in computing systems
Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Aphasia is a cognitive disorder that impairs speech and language. From interviews with
aphasic individuals, their caregivers, and speech-language pathologists, the need was 
identified for a daily planner that allows aphasic users to independently manage their 
appointments. We used a participatory design approach to develop ESI Planner (the 
Enhanced with Sound and Images Planner) for use on a PDA and subsequently evaluated 
it in a lab study. This methodology was used in order to achieve both usab ... 

Keywords: assistive technology, cognitive disabilities, handheld devices, multi-modal 
interaction, participatory design, universal usability